CorpusReader: designing and querying multi-layer corpora
Abstract
CorpusReader is a framework for creating and querying multi-layer corpora: corpora that carry several levels of analysis (morphology, syntax, semantics, etc.) and are designed for observing correlations between these levels. Building, representing and querying such corpora is complex. CorpusReader's distinguishing feature is that it merges the outputs of existing corpus analysis tools, which avoids having to integrate the tools themselves at the software level.
KEYWORDS: multi-annotated corpora, quantitative linguistics, corpus linguistics, XML, annotation graphs.
Similar resources
Formalising Multi-layer Corpora in OWL DL - Lexicon Modelling, Querying and Consistency Control
We present a general approach to formally modelling corpora with multi-layered annotation, thereby inducing a lexicon model in a typed logical representation language, OWL DL. This model can be interpreted as a graph structure that offers flexible querying functionality beyond current XML-based query languages and powerful methods for consistency control. We illustrate our approach by applying ...
Querying Annotated Speech Corpora
This paper is concerned with querying annotated speech corpora. A growing number of such corpora is currently being created worldwide; however, their usefulness for a wider research community is restricted by the lack of standard tools for creating, editing, annotating, storing and querying them. Two solutions for these problems are presented here: the XML-based data format TASX for corpus crea...
GenoQuery: a new querying module for functional annotation in a genomic warehouse
MOTIVATION We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improving functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these dat...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...
Querying Multi-Layer Annotation and Alignment in Translation Corpora
When dealing with linguistically annotated and aligned corpora current research concentrates mainly on the investigation of translation properties. However, annotated and aligned corpora can be useful for practical translation as well, since translators also work with parallel corpora. Translators typically use raw sentence aligned corpora stored in translation memories. In this paper we will s...
Journal: TAL
Volume: 49
Publication date: 2008